Strings are Python's built-in data type for handling text. They are immutable, so you cannot add, remove, or update any character in an existing string. To perform these operations, you need to create a new string and assign it to an existing or new variable name.
A string is a sequence of characters.
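As a minimal sketch of the immutability described above (the variable names are illustrative and not part of the original notebook), item assignment fails with a TypeError, while building a new string works:

```python
# Strings are immutable: item assignment raises TypeError.
word = "Camel"
try:
    word[0] = "K"
    assignment_failed = False
except TypeError:
    assignment_failed = True
print(assignment_failed)

# To "change" a string, build a new one and bind a name to it.
new_word = "K" + word[1:]
print(word, new_word)
```

The original string is untouched; only the new name refers to the modified text.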
An escape character is a character that invokes an alternative interpretation of the subsequent characters in a character sequence. An escape character is a particular case of a metacharacter.
Escape sequence | Hex value in ASCII | Character represented |
---|---|---|
\a | 07 | Alert (Beep, Bell) |
\b | 08 | Backspace |
\f | 0C | Formfeed |
\n | 0A | Newline (Line Feed) |
\r | 0D | Carriage Return |
\t | 09 | Horizontal Tab |
\v | 0B | Vertical Tab |
\\ | 5C | Backslash |
\' | 27 | Single quotation mark |
\" | 22 | Double quotation mark |
\? | 3F | Question mark (used in C to avoid trigraphs; not an escape sequence in Python) |
\nnn | any | The character whose numerical value is given by nnn interpreted as an octal number |
\xhh | any | The character whose numerical value is given by hh interpreted as a hexadecimal number |
\e | 1B | Escape character (some C compilers only; not an escape sequence in Python) |
\Uhhhhhhhh | none | Unicode code point where h is a hexadecimal digit |
\uhhhh | none | Unicode code point below 10000 hexadecimal |
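The table can be checked with a short sketch. It assumes nothing beyond standard Python string literals and shows that an escape sequence becomes a single character, and that the octal, hexadecimal, and Unicode forms can name the same character:

```python
# Each escape sequence below is a single character in the resulting string.
print(len("\n"), len("\t"), len("\\"))

# Octal, hexadecimal and Unicode escapes can all name the same character.
same = "\101" == "\x41" == "\u0041" == "\U00000041" == "A"
print(same)
```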
Strings can be classified into 3 categories:
Standard strings: a standard string is one in which the escape characters are processed. Since Python 3, strings are Unicode strings by default.
Raw strings: prefixed with r or R. Escape characters are not processed (for example s = r'\n', where s will contain the characters \ and n).
Formatted strings (f-strings): prefixed with f or F. These strings may contain replacement fields, which are expressions delimited by curly braces {}. While other string literals always have a constant value, formatted strings are really expressions evaluated at run time. (New in Python 3.6.)
In [1]:
#### Standard String Examples:
friend = 'Chandu\tNalluri'
print(friend)
In [2]:
manager_details = "# Roshan Musheer:\nExcellent Manager and human being."
print(manager_details)
Raw strings, on the other hand, treat escape characters as normal characters and do not process them.
a = r'Roshan\tMusheer'
# Roshan\tMusheer
u = u'Björk'
In [7]:
a = r'Roshan\tMusheer'
print(a)
In [1]:
path = "C:\new_data\technical_jargons"
print(path)
path = R"C:\new_data\technical_jargons"
print(path)
NOTE: both r and R work the same way
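The difference between standard and raw strings can be seen by comparing lengths. This is a small sketch with illustrative variable names:

```python
standard = "a\tb"   # \t is processed into one tab character
raw = r"a\tb"       # the backslash and 't' are kept as two characters
print(len(standard), len(raw))

# A raw string is just another way to write the literal with a doubled backslash.
print(raw == "a\\tb")
```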
F-Strings are prefixed with f or F. These strings may contain replacement fields, which are expressions delimited by curly braces {}. While other string literals always have a constant value, formatted strings are really expressions evaluated at run time. (New in Python 3.6.)
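A minimal sketch of replacement fields, using made-up variable names, shows that the braces may hold any expression and that the expression is evaluated when the line runs:

```python
name = "World"
count = 3
# Any expression may appear inside the braces; it is evaluated at run time.
greeting = f"Hello {name.upper()}! {count} + 1 = {count + 1}"
print(greeting)

# A format specification can follow a colon inside the braces.
price = f"{3.14159:.2f}"
print(price)
```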
In [16]:
s = 'Camel'
print(id(s))
In [9]:
a = 'Roshan\tMusheer'
print(a)
String concatenation is the process of joining two or more strings into a single string. As already discussed, a string is an immutable data type, so concatenation has to create a new string: the original strings remain unchanged and a new one is created from their text.
There are multiple ways to achieve concatenation. The most common is to use the + operator.
Let's take an example where we have three strings and concatenate them with +.
In [21]:
st_the = "The "
st_action = " ran away !!!"
st = st_the + s + st_action
print(st)
print(s)
print(st_the)
print(st_action)
print(id(st))
print(id(st_the))
print(id(s))
print(id(st_action))
In [3]:
print(dir(s))
String interpolation (or variable interpolation, variable substitution, or variable expansion) is the process of evaluating a string literal containing one or more placeholders, yielding a result in which the placeholders are replaced with their corresponding values.
In [22]:
print( 'Size of %s => %d' % (s, len(s)))
print(dir(s))
print( 'Size of %s => %d' % (s, s.__len__()))
def size(strdata):
    c = 0
    for a in strdata:
        c += 1
    return c
print(size("Anshu"))
F-strings are the newest interpolation method; they were introduced in Python 3.6.
In [3]:
name = 'World'
program = 'Python'
print(f'Hello {name}! This is {program}')
name = 'Ravi'
program = 'Python'
print(f'Hello {name}! This is {program}')
In [5]:
# String processed as a sequence
s = "Murthy "
for ch in s:
    print(ch, end=',')
# print(help(print))
print("\b.")
print("~"*79)
In [6]:
# Strings are objects
if s.startswith('M'): print(s.upper())
print(s.lower())
print("~"*79)
# what will happen?
print(3*s)
# print(dir(s))
In [7]:
s = " Murthy "
age = 5
print(s + str(age))
print(s.strip(), age)
# print(s + age)
In [17]:
st = " Mayank Johri "
print(len(st))
s = st.strip()
print(len(s))
print(st.rstrip())
print(st.lstrip())
In [13]:
m = "Mohan Shah"
x = ["mon", "tues", "wed"]
y = ","
a = "On Leave"
print(y.join(x)) # -> mon,tues,wed
print(m.join(y))
print(a.join(y))
print(y.join(a))
print(a.join(m))
Create a string from a list of string items
In [14]:
" ".join(x)
Out[14]:
In [15]:
book_desc = ["This", "book", "is good"]
" ".join(book_desc)
Out[15]:
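The % operator is used for string interpolation. Interpolation builds the result in a single step, so it can use memory more efficiently than chaining several + concatenations, which create intermediate strings.
Symbols used in the interpolation:
%s: string
%d: integer
%o: octal
%x: hexadecimal
%f: real number
%e: exponential
%%: percent sign
Symbols can be used to display numbers in various formats.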
Example:
In [20]:
# Left-pad with zeros
print('Now is %02d:%02d.' % (6, 30))
# Real (the number after the decimal point specifies how many decimal digits)
print('Percent: %.1f%%, Exponential: %.2e' % (5.333, 0.00314))
# Octal and hexadecimal
print('Decimal: %d, Octal: %o, Hexadecimal: %x' % (10, 10, 10))
In [4]:
peoples = [('Mayank', 'friend', 'Manish'),
           ('Mayank', 'reportee', 'Roshan Musheer')]
# Parameters are identified by order
msg = '{0} is {1} of {2}'
for name, relationship, friend in peoples:
    print(msg.format(name, relationship, friend))
In [10]:
# Parameters are identified by name
msg = '{greeting}, it is {hour:02d}:{minute:02d}'
print(msg.format(greeting='Good Morning', minute=2, hour=10))
print(msg)
# Builtin function format()
print ('Pi =', format(3.14159, '.3e'))
print ('Pi =', format(3.14159, '.1e'))
In [11]:
'{} {}'.format('सूर्य', 'नमस्कार')
Out[11]:
In [41]:
'{1} {0}'.format('सूर्य', 'नमस्कार')
Out[41]:
In [23]:
s = '{:>30}'.format('सूर्य नमस्कार')
print(s)
print(len(s))
In [19]:
s = '{:>2}'.format('सूर्य नमस्कार')
print(s)
print(len(s))
In [14]:
'{:20}'.format('सूर्य नमस्कार')
Out[14]:
In [49]:
'{:4}'.format('Bonjour')
Out[49]:
In [28]:
'{:^<5}'.format('Ja')
Out[28]:
In [58]:
'{:^7}'.format('こんにちは')
Out[58]:
In [37]:
'{:.5}'.format('Bonjour')
Out[37]:
In [36]:
s = '{:10.5}'.format('testdd नमस्कार')
print(len(s))
print(s)
In [ ]:
'{:10.5}'.format('Bonjour')
In [106]:
'{:{align}{width}}'.format('Bonjour', align='^', width='9')
Out[106]:
In [107]:
'{:.{prec}} = {:.{prec}f}'.format('Bonjour', 2.22, prec=4)
Out[107]:
In [66]:
'{:d}'.format(1980)
Out[66]:
In [67]:
'{:f}'.format(3.141592653589793)
Out[67]:
In [40]:
'{:2f}'.format(3.141592653589793)
Out[40]:
In [77]:
'{:04d}'.format(119)
Out[77]:
In [68]:
'{:06.2f}'.format(3.141592653589793)
Out[68]:
In [78]:
'{:+d}'.format(119)
Out[78]:
In [79]:
'{:+d}'.format(-119)
Out[79]:
In [86]:
### Need to find for complex & boolean numbers
## '{:+d+d}'.format(-3 + 2j)
In [89]:
'{:=5d}'.format((- 111))
Out[89]:
In [90]:
'{: d}'.format(101)
Out[90]:
In [92]:
'{name} {surname}'.format(name='Mayank', surname='Johri')
Out[92]:
In [95]:
user = dict(name='Mayank', surname='Johri')
'{u[name]} {u[surname]}'.format(u=user)
Out[95]:
In [97]:
lst = list(range(10))
'{l[2]} {l[7]}'.format(l=lst)
Out[97]:
In [100]:
from datetime import datetime
'{:%Y-%m-%d %H:%M}'.format(datetime(2017, 12, 23, 14, 15))
Out[100]:
In [31]:
class Yoga(object):
    def __repr__(self):
        return 'सूर्य नमस्कार'
In [35]:
'{0!r} <-> {0!a}'.format(Yoga())
Out[35]:
In [42]:
myStr = "maya Deploy, version: 0.0.3 "
print(myStr.capitalize())
print(myStr.center(60))
print(myStr.center(60, "*"))
print(myStr.center(10, "*"))
print(myStr.count('a'))
print(myStr.count('e'))
print(myStr.endswith('all'))
print(myStr.endswith('.0.3'))
print(myStr.endswith('.0.3 '))
print(myStr.find("g"))
print(myStr.find("e"))
Note: The find() method should be used only if you need to know the position of sub. To check whether sub is a substring, use the in operator:
substring in main_string returns True or False.
In [45]:
print("ma" in myStr)
In [46]:
print("M" in myStr)
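The note about find() and the in operator can be illustrated with a short sketch. The variable names are illustrative; the sample text is the same as myStr above:

```python
text = "maya Deploy, version: 0.0.3 "

# find() returns the index of the first match, or -1 when there is none.
pos_found = text.find("Deploy")
pos_missing = text.find("xyz")
print(pos_found, pos_missing)

# Using find() as a truth test is a trap: index 0 is falsy and -1 is truthy.
print(bool(text.find("maya")), bool(text.find("xyz")))

# The in operator answers the membership question directly.
print("maya" in text, "xyz" in text)
```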
In [60]:
c = "one"
print(c.isalpha())
c = "1"
print(c.isalpha())
In [20]:
superscripts = "\u00B2"
five = "\u0A6B"
five_punjabi = "੫"
ten_hindi = "१०"
num_one = "1"
one = "one"
fractions = "\u00BC"
In [15]:
print(superscripts)
print(five)
print(five_punjabi)
print(ten_hindi)
print(num_one)
print(one)
print(fractions)
In [17]:
print(superscripts.isdecimal())
print(five.isdecimal())
print(five_punjabi.isdecimal())
print(ten_hindi.isdecimal())
print(num_one.isdecimal())
print(one.isdecimal())
print(fractions.isdecimal())
In [12]:
print("10 ->", "10".isdecimal())
print("10.001".isdecimal())
In [13]:
text = "this 2009"  # renamed from str to avoid shadowing the built-in type
print(text.isdecimal())
In [23]:
# str.isdigit() (Decimals, Subscripts, Superscripts)
print(superscripts.isdigit())
print(five.isdigit())
print(five_punjabi.isdigit())
print(ten_hindi.isdigit())
print(num_one.isdigit())
print(one.isdigit())
print(fractions.isdigit())
In [24]:
print("10".isdigit())
text = "this 2009"  # renamed from str to avoid shadowing the built-in type
print(text.isdigit())
text = "23443.434"
print(text.isdigit())
In [26]:
print(superscripts.isnumeric())
print(five.isnumeric())
print(five_punjabi.isnumeric())
print(ten_hindi.isnumeric())
print(num_one.isnumeric())
print(one.isnumeric())
print(fractions.isnumeric())
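The three checks above differ in how broad they are. The following sketch puts them side by side for a few of the same characters; the dictionary and its labels are illustrative only:

```python
samples = {
    "ASCII digit": "1",
    "Gurmukhi digit five": "\u0A6B",
    "superscript two": "\u00B2",
    "fraction one quarter": "\u00BC",
}

results = {}
for label, ch in samples.items():
    # Tuple order: (isdecimal, isdigit, isnumeric)
    results[label] = (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
    print(f"{label:22} {results[label]}")
```

Every decimal character is also a digit, and every digit is also numeric, but not the other way around.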
In [36]:
print(superscripts.isalnum())
print(five.isalnum())
print(five_punjabi.isalnum())
print(ten_hindi.isalnum())
print(num_one.isalnum())
print(one.isalnum())
print(fractions.isalnum())
ten_One = "10 One"
print(ten_One.isalnum())
tenOne = "10One"
print(tenOne.isalnum())
print("one".isalnum())
print("thirteen".isalnum())
In [39]:
string1 = 'Hello'
string2 = 'hello'
if string1.lower() == string2.lower():
    print("The strings are the same (case insensitive)")
else:
    print("The strings are not the same (case insensitive)")
In [49]:
str_lower = "Σίσυφος"
str_upper = "ΣΊΣΥΦΟΣ"
if str_upper.lower() == str_lower.lower():
    print("The strings are the same (case insensitive)")
else:
    print("The strings are not the same (case insensitive)")
However, comparing with lower() fails in some cases:
In [54]:
str_lower = "ß"
str_upper = "SS"
if str_upper.lower() == str_lower.lower():
    print("The strings are the same (case insensitive)")
else:
    print("The strings are not the same (case insensitive)")
So the best bet is to use casefold(). Let's replace lower() with casefold() in the above example.
In [57]:
str_lower = "ß"
str_upper = "SS"
if str_upper.casefold() == str_lower.casefold():
    print("The strings are the same (case insensitive)")
else:
    print("The strings are not the same (case insensitive)")
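The comparison can be wrapped in a small helper. This is only a sketch: equal_ignore_case is a hypothetical name, not a standard library function.

```python
def equal_ignore_case(left, right):
    """Hypothetical helper: compare two strings ignoring case via casefold()."""
    return left.casefold() == right.casefold()


print(equal_ignore_case("Hello", "hello"))
print(equal_ignore_case("ß", "SS"))

# For contrast, lower() leaves "ß" unchanged, so this comparison is False.
print("ß".lower() == "SS".lower())
```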
In [35]:
import string
# the alphabet
print(dir(string))
In [86]:
a = string.ascii_letters
print(a)
In [83]:
# Shifting left the alphabet
b = a[1:] + a[0]
print(b)
In [84]:
print(b.__doc__)
In [34]:
print(string.digits)
print(string.hexdigits)
print(string.printable)
In [91]:
import string
# Creates a template string
st = string.Template('Dated: $when\n$warning occurred in $when $$$what $$what.')
# Fills the model with a dictionary
s = st.substitute({'warning': 'Lack of electricity',
'when': 'April 3, 2002',
'what': 'EOM'})
# Shows:
# Dated: April 3, 2002
# Lack of electricity occurred in April 3, 2002 $EOM $what.
print(s)
In [1]:
# Unicode string (str)
u = 'Hüsker Dü'
# Encode to bytes
s = u.encode('latin1')
print(s, '=>', type(s))
# Decode the bytes back to str
u = s.decode('latin1')
print(repr(u), '=>', type(u))
Both encode() and decode() take the name of the encoding as an argument (in Python 3 the default is "utf-8"). The most commonly used encodings are "latin1" and "utf8".
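The choice of encoding changes the bytes produced. This sketch uses the same text as above, written with escapes, and does a round trip with both encodings:

```python
text = "H\u00fcsker D\u00fc"  # "Hüsker Dü" written with Unicode escapes

latin1_bytes = text.encode("latin1")
utf8_bytes = text.encode("utf8")
# The same 9 characters take a different number of bytes in each encoding.
print(len(text), len(latin1_bytes), len(utf8_bytes))

# Decoding with the matching encoding restores the original text.
print(latin1_bytes.decode("latin1") == text)
print(utf8_bytes.decode("utf8") == text)
```

Decoding with a different encoding than the one used to encode may raise an error or silently produce the wrong characters, so the two calls should always use the same name.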